feat(conversations): server-authoritative resumable chat turns + tab rewrite by Ovaculos · Pull Request #302 · NimbleBrainInc/nimblebrain

Ovaculos · 2026-05-27T17:52:03Z

Summary

Rewrite the conversation tab around a server-authoritative resumable turn model (issue #254). A chat turn now runs to completion on the server regardless of the client's connection; the browser is a viewer that attaches to a per-conversation event log, replays the in-flight turn, and tails live events.

Closes #254. Closes #294 (Stop button + server-side run cancellation decoupled from transport). Includes the #253 follow-up (auto-title prompt) — upstream landed an XML-containment + sanitizer approach in #261 that is preserved here.

What this delivers

RunBus (src/runtime/run-bus.ts) — per-conversation in-memory log of buffered events with monotonic seq, ownership of the turn's AbortController, and a grace window for late re-attach. 500k per-run event cap with terminal error on overflow.
Detached startTurn — POST /v1/chat/start returns {conversationId} immediately; the turn keeps running across disconnect / refresh / tab switch. Stop button → POST /v1/conversations/:id/cancel.
Per-conversation SSE viewer — GET /v1/conversations/:id/events replays from a client-supplied ?afterSeq=N then tails. Reconnect/backoff handled in conversation-stream.ts.
Per-conversation slice store (web/src/hooks/chat-store.ts) — useSyncExternalStore-backed module singleton, LRU cap, hasPendingTail for resume reconcile, probeConversation for sessionStorage-restored streaming dots.
Stage-1 conversation persistence at {workDir}/conversations/{id}.jsonl, user-scoped ownership, check:conversation-paths enforces it.
Conversations bundle UI: background-refresh list, in-place title update via a new synapse/conversation-title postMessage channel (no full list refetch on title resolve).

Known limitation: RunBus is single-process

Under platform.replicas > 1, a viewer landing on a different pod than the one that started the turn sees isActive:false and the live frames don't fan out cross-pod. Sticky routing on Mcp-Session-Id (existing prereq) mitigates the active-tab case; a pod restart or cross-pod viewer still drops resume mid-turn. A clustered Redis-backed RunBus is deferred — tracked in src/runtime/run-bus.ts. serve warns at boot when sessionStore.type === "redis" so the gap is visible.

Other fixes folded into this PR

Stop button now emits cancelled frame to live viewers (was: just aborted, viewer stuck isStreaming:true).
startTurn serializes create to prevent double-create race when client supplies an id.
ConversationCorruptedError → 422 in handleChatStart (was: 500).
Log sinks (structured-log-sink, workspace-log-sink) wrap writes in try/catch + warn-once-per-episode (was: throw into the event-emit path and crash detached turns).
Auto-title generates and live-updates via new conversation.title SSE event (with wsId scoping). List iframe receives via dedicated postMessage; chat panel header updates via existing chat-store hook.
Resume race: partial on-disk turn now completes from RunBus grace replay (no flicker). Three-way client reconcile (isActive / activeSeq>0 / GC'd).
Stuck streaming after >30s reconnect: server-authoritative isActive:false on subscribed frame drives chat-store reset.

Tests & hygiene

Runtime.start({}) now hard-errors under NODE_ENV=test when workDir is missing. Stops integration tests from silently writing echo-model conversations into the developer's real ~/.nimblebrain. Fixed 8 leakers + added test/helpers/test-workdir.ts.
scripts/check-code-style.ts skips node_modules/ and src/bundles/ per its doc comment (was: scanned vendored .d.ts after build:bundles, surfacing 1625 false-positive violations).
20+ new tests covering chat-store slice semantics, partial-tail-completes-from-grace, complete-tail-no-flicker, RunBus overflow cap, etc.

Test plan

bun run verify — full CI parity, all green
Manual: start a turn, refresh mid-stream — assistant message reconstructs from disk + RunBus replay, no token loss
Manual: start a turn, switch to a different conversation mid-stream, switch back — no bleed, response still streaming on origin
Manual: start a turn, close the tab, reopen — title pops in live when auto-title resolves
Manual: Stop button — cancellation event arrives, streaming dot clears across all open tabs
Verify sessionStore.type: "redis" startup warning fires when configured

🤖 Generated with Claude Code

Decouple a chat turn's lifecycle from the originating HTTP request. A turn now runs to completion on the server regardless of the client connection; the client is a viewer that attaches to a replayable per-conversation event log (RunBus), replays the in-flight turn, then tails live events. Refresh / conversation-switch / disconnect never lose or duplicate work — they detach and re-attach. Only the Stop button (RunBus.cancel) aborts generation. - RunBus: in-memory, replayable, monotonic-seq turn log with grace-window retention for late re-attach; threads its own AbortSignal into the engine. - runtime.startTurn: detached turn driver; seeds user.message into the log; reports isTurnActive / activeConversationIds / turnSeq / getTurnReplay. - Per-conversation SSE: ConversationEventManager + /v1/conversations/:id/events route with subscribed{isActive} frame + buffered replay; /v1/chat/start and cancel endpoints. - Broadcast conversation.title + data.changed on the global SSE when a title generates, so viewers update without a turn-stream connection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…imbleBrainInc#253) The single-string transcript prompt made the fast model pattern-match a conversation to continue and emit the start of the assistant's response as the title (worst on creative/long answers). Use real role turns (user → assistant → user-instruction); the trailing user turn is an unambiguous instruction, not text to continue. Closes NimbleBrainInc#253 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replace the single shared useChat with a per-conversation slice store. Each conversation owns its own state; a turn's stream writes only into its origin slice, so switching conversations mid-turn never bleeds the response into the destination chat (NimbleBrainInc#254) and switching back shows it still arriving. - chat-store: slice map (LRU-capped) backing useChat via useSyncExternalStore; sendTurn → /v1/chat/start then subscribe via conversation-stream (replay + live tail). Resume reflects server isActive (indicator/Stop survive reload), trims the stale in-flight turn from disk, and drops grace-buffer replay of a finished turn (no duplicate). AbortSignal threaded for cleanup only. - Streaming dots: store streamingIds → hostContext (SlotRenderer) so the list shows live per-row activity; persisted per-tab in sessionStorage and re-probed on reload (dots survive refresh, self-heal when finished). - Reload restore: last-viewed conversation reopened from sessionStorage. - Live title: consume conversation.title SSE → slice title; ChatPanel header prefers the generated title. MessageList lands the view at the bottom. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Per-row streaming indicator: read host-pushed streamingConversationIds (useHostContext) and render a pulsing dot on conversations with an in-flight turn. - No more list flicker on updates: data.changed refreshes run in the background (no skeleton swap) and only for conversation changes — other apps' data.changed are ignored. Skeleton stays for initial load + view switches. - New conversations appear immediately: ConversationIndex.flushPending() picks up a just-created file on the read path instead of racing the fs.watch debounce. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The per-conversation viewer replaced the old streaming path, leaving the client-side pieces unreferenced: - client.ts: streamChat / streamChatMultipart / consumeSSEStream (callers now use startChatTurn + connectConversationStream). - conversation-sse.ts and useConversationEvents.ts (the old cross-tab SSE hook) — zero importers. The server `/v1/chat/stream` + handleChatStream stay (still a valid REST surface, exercised by integration tests). Also comment the one ported `as unknown as` cast for the done-payload `files` field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…prompt change The capturing-model test helper skipped auto-title model calls by matching the old prompt string ("Generate a 3-6 word title"). The NimbleBrainInc#253 rework changed the title system prompt, so the guard stopped recognizing title calls and they polluted the captured chat system prompt. Match the new prompt instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Detached turns emit events after the originating request ends; if the workdir is gone (test teardown) or a write otherwise fails (disk full, perms), the unguarded appendFileSync threw out of emit() as an unhandled error. Wrap the write best-effort — a failed log line must never crash the caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

RunBus.cancel() flips the run to terminal synchronously, after which publish() is a no-op. The engine's post-abort `cancelled` publish (startTurn's catch) therefore never reached the SSE feed (onTurnEvent), so the Stop button aborted generation but left the UI stuck streaming until a reload. done/error work because they publish before ending while the run is still active. Publish `cancelled` in cancelTurn BEFORE RunBus.cancel ends the run. The engine's later cancelled publish becomes a harmless no-op. Adds a regression test on the live onTurnEvent path (the attach/onEnd path the old tests used did not cover this). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

startTurn → store.load throws ConversationCorruptedError for a pre-migration (ownerless) conversation on the resume path, but handleChatStart let it fall through to a raw 500. handleChat and handleChatCancel both already map it to 422 via conversationCorruptedResponse — match them. Adds an HTTP test asserting /v1/chat/start on an ownerless conversation returns 422. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two concurrent starts with the same not-yet-existing conversationId could both load()->null then both create({id}) — and create is a truncating writeFile with no exists-guard, so the loser would clobber the winner's in-progress file. runBus.begin serializes the run, but it ran AFTER create, too late to help. For a provided id, reserve the run (begin) BEFORE touching storage: the loser now throws RunInProgressError before it can create. On any load/create failure we evict to release the reservation so a failed start can't leave the id stuck "running". Adds a regression test (delayed load forces the window; asserts create runs exactly once). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

connectConversationStream reconnects with backoff up to 30s; RunBus GCs a terminal turn 30s after it ends. If a viewer is disconnected past that window while the turn finishes, the reconnect's subscribed{isActive:false} carries no replayed terminal frame — and the client only reconciled on the first (resume) subscribed, so the slice stayed isStreaming=true until a manual reload. Reconcile on any subscribed frame: if the server reports no active turn while the slice still thinks it's streaming, clear the streaming state (the server is authoritative). A terminal frame still within grace replays right after and finalizes the content; if GC'd, the partial is kept and a reload fetches the final — either way the spinner no longer hangs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…icker) loadConversation reads disk then subscribes. A short turn ending in that window left a partial snapshot, and onSubscribed hit the drop branch — discarding the grace-buffer replay that held the full `done`, so the final response only showed after a reload. Mark the gap precisely instead of always recombining (which would flicker every recently-finished conversation on open): the reader flags an assistant turn with no run.done/run.error as `pending`. On resume the client now branches on it: - live turn, or pending tail + retained run (activeSeq>0) → trim + apply replay (rebuilds the full turn, no duplicate); - pending tail + run GC'd → refetch the now-complete transcript; - complete tail → ignore the grace replay (no dup, no flicker — the common open-a-recent-conversation path is untouched). A trailing user message (no assistant yet) also counts as pending. probe slices have no messages, so they're unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

conversation-stream viewers track the RunBus seq (id: line) for replay/resume. The legacy /v1/chat and /v1/chat/stream endpoints fan out via broadcastToConversation, which is seq-less — those frames apply live but don't advance lastSeq and have no resume. The web shell is RunBus-only; only an external caller hitting the legacy endpoints while a web tab watches the same conversation would mix the two. Document the boundary on both sides. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The earlier hardening (commit 9ddfd2e) silenced log-sink throws to keep emit() safe, but a real disk-full / perms incident produced zero operator signal. Surface the first write failure of an episode with console.warn, then suppress until a subsequent success re-arms — first failure visible, sustained outage doesn't spam, intermittent recovery re-warns when it breaks again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

A viewer landing on a different pod than the one running the turn sees isActive:false and silently drops resume — RunBus is single-process and isn't yet clustered. Sticky routing on Mcp-Session-Id (prereq NimbleBrainInc#2) mitigates for the active tab; a pod restart or any cross-pod viewer still loses the in-flight turn. Two changes so this isn't a silent regression: - AGENTS.md: add a "Known limitation under replicas > 1" note under the four prereqs, pointing at run-bus.ts and the deferred clustered RunBus. - serve.ts: emit a loud startup warn when sessionStore.type === "redis" (the one strong signal of multi-replica intent) so operators see the gap at boot. Not a hard error — sticky routing is sufficient for many cases, and operators may accept the limit until the clustered RunBus lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…owth A runaway or adversarial event producer — a model stream looping pathologically, a chatty tool spamming progress events — could push log.events past safe memory limits before the grace-window GC fired. Tens of thousands of token deltas held in-process for the grace window on a multi-tenant deployment is real heap pressure. Cap at 500k events per run with discard-as-error semantics: on overflow, abort the turn's signal, append a synthetic terminal `error` event with `error: "buffer_overflow"` so viewers see the cause, end the run as `error`, and warn once. Subsequent publishes drop via the standard `status !== "running"` guard. 500k sits ~10x over the worst legit run we can measure (extended-thinking turn with chatty tool progress: ~30-50k events). Hitting it requires the kind of pathological producer that is exactly what this cap exists to terminate cleanly. Configurable via the RunBus constructor for tests; production leaves it at the default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`Runtime.start({})` defaulted `workDir` to `~/.nimblebrain`, so any integration test that forgot the field silently wrote echo-model conversations, test workspaces, and bundle data straight into the developer's real workdir. They then showed up in the conversations tab on `bun run dev` — confusing diagnosis that looked like a title-pipeline bug but was really 268 of 282 convs being test fixtures with `workspaceId: "ws_test"`. `resolveWorkDir` now throws under `NODE_ENV=test` (bun:test sets this automatically) when `workDir` is missing, with a clear example fix. Catches future regressions of this category at construction time. Eight test files were leaking: - chat-stream-concurrent - appcontext-wiring (3 Runtime.starts) - dependency-checking (6 starts; share file-scope testDir) - conversation-integration - remote-integration (3 starts) - runtime (7 starts; one needed its own workDir to avoid skill bleed from earlier same-file tests writing to the shared global skill dir) - security/workspace-isolation - usage-integration New helper `test/helpers/test-workdir.ts` exposes `makeTestWorkDir(label)` that returns `{workDir, cleanup}` for tests that want per-test isolation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The runtime fired `emitConversationsChanged()` three times on a first-turn chat: when isNew (sidebar row appears), at .finally (terminal state), and again when the title resolved. Each broadcast triggered the conversations-list iframe to refetch the whole list. The title-resolve refetch is now redundant. The `conversation.title` SSE event already exists (the chat panel header consumes it for live title display); the list iframe just didn't have a channel for it. Add one: the web shell forwards `conversation.title` via postMessage to matching iframes, and the conversations bundle listens with a raw `window.addEventListener` (the synapse SDK doesn't know this envelope, which is fine — its own listener ignores unrecognized methods, no double-handling). The iframe patches the matching row's title in-place, no list refetch. Drop the now-redundant `emitConversationsChanged()` from the title-resolve handler. Broadcast count drops from 3 to 2 on first turn; subsequent turns unchanged at 1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…dle src Two unrelated cleanups: - `web/src/api/client.ts`: comments still referenced the deleted `streamChat` / `streamChatMultipart` helpers (uploadResource header pattern note + startChatTurn docblock). Rot-flagged only; no behavior change. - `scripts/check-code-style.ts`: the doc comment says "bundles are out of scope" but the glob walked `src/**/*.ts`, which after `bun run build:bundles` pulls in `src/bundles/*/ui/node_modules/` and surfaces 1625 false-positive inline-type-import violations in vendored `.d.ts` files. Skip `node_modules/` and `src/bundles/` per the documented scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tab-rework # Conflicts: # src/api/routes/chat.ts # src/conversation/auto-title.ts # web/src/api/client.ts # web/src/api/conversation-sse.ts # web/src/components/AppWithChat.tsx # web/src/components/ChatPanel.tsx # web/src/context/ChatContext.tsx # web/src/hooks/useChat.ts # web/test/inlineError.test.tsx # web/test/streamingState.test.tsx

…tab-rework Conflicts resolved in: - web/test/inlineError.test.tsx - web/test/streamingState.test.tsx Adopted upstream's realClient snapshot pattern from web/test/setup.ts (safer than `await import("../src/api/client")` per the setup.ts comment) while preserving the branch's server-authoritative mocks (startChatTurn / startChatTurnMultipart / cancelChatTurn + connectConversationStream).

Ovaculos and others added 21 commits May 25, 2026 10:25

mgoldsborough mentioned this pull request May 31, 2026

fix(conversations): broadcast new chat to list live (#155) #255

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(conversations): server-authoritative resumable chat turns + tab rewrite#302

feat(conversations): server-authoritative resumable chat turns + tab rewrite#302
Ovaculos wants to merge 21 commits into
NimbleBrainInc:mainfrom
Ovaculos:feat/conversations-tab-rework

Ovaculos commented May 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Ovaculos commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What this delivers

Known limitation: RunBus is single-process

Other fixes folded into this PR

Tests & hygiene

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Ovaculos commented May 27, 2026 •

edited

Loading